-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Online DDL: ensure high lock_wait_timeout
in Vreplication cut-over
#16601
Online DDL: ensure high lock_wait_timeout
in Vreplication cut-over
#16601
Conversation
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
Tests
Documentation
New flags
If a workflow is added or modified:
Backward compatibility
|
go/vt/vttablet/onlineddl/executor.go
Outdated
@@ -1260,6 +1279,25 @@ func (e *Executor) initMigrationSQLMode(ctx context.Context, onlineDDL *schema.O | |||
return deferFunc, nil | |||
} | |||
|
|||
// initConnectionLockWaitTimeout sets the given locak_wait_timeout for the given connection, with a deferred value restoration function |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lock is spelt wrong here (locak)
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #16601 +/- ##
==========================================
- Coverage 68.83% 68.82% -0.01%
==========================================
Files 1557 1558 +1
Lines 199983 200061 +78
==========================================
+ Hits 137655 137694 +39
- Misses 62328 62367 +39 ☔ View full report in Codecov by Sentry. |
@@ -578,6 +583,16 @@ func testScheduler(t *testing.T) { | |||
assert.NotEmpty(t, rs.Rows) | |||
}) | |||
|
|||
t.Run("low @@lock_wait_timeout", func(t *testing.T) { | |||
defer primaryTablet.VttabletProcess.QueryTablet("set global lock_wait_timeout=100000", keyspaceName, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does this 100000
value come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some high value. Let me change that into something more official. One sec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now setting to actual original lock_wait_timeout
.
go/vt/vttablet/onlineddl/executor.go
Outdated
// Set large enough `@@lock_wait_timeout` so that it does not interfere with the cut-over operation. | ||
// The code will ensure everything that needs to be terminated by `migrationCutOverThreshold` will be terminated. | ||
// Setting `lock_wait_timeout` here is done to just ensure MySQL does not interrupt prematurely. It is not | ||
// done by means of ensuring cleanup/termination of operations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what the It is not // done by means of ensuring cleanup/termination of operations.
part is about?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to explain that lock_wait_timeout
has no role in the cut-over algorithm. It's not placed here so as to be part of the algorithm. We don't rely on MySQL to e.g. ensure proper rollback. The only reason we put it here is to ensure it is "higher than we'd run into".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can probably remove that whole sentence. The comment already says "The code will ensure ...", which is sufficient explanation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
go/vt/vttablet/onlineddl/executor.go
Outdated
func (e *Executor) initConnectionLockWaitTimeout(ctx context.Context, conn *connpool.Conn, lockWaitTimeout time.Duration) (deferFunc func(), err error) { | ||
deferFunc = func() {} | ||
|
||
if _, err := conn.Exec(ctx, `set @lock_wait_timeout=@@lock_wait_timeout`, 0, false); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit, but IMO slightly clearer to always use the return var. i.e. not use :=
here. Behavior is the same though either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO it would add clarity and intention to specify global here and a comment for those not super famiilar:
// Set a connection level SQL variable to the current global value.
set @lock_wait_timeout=@@global.lock_wait_timeout
Similar for the defer:
set @@session.lock_wait_timeout=@lock_wait_timeout
(we are using @@session
already when we actually set it on line 1290)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit, but IMO slightly clearer to always use the return var. i.e. not use := here. Behavior is the same though either way.
My take is the opposite, I always find it more comfortable to be explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now using explicit @@session.
at all places. In no place do I intend to use @@global.lock_wait_timeout
. I'm reading the current session value, modify it, restoring it. I don't touch the global setting.
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
go/vt/vttablet/onlineddl/executor.go
Outdated
// Set large enough `@@lock_wait_timeout` so that it does not interfere with the cut-over operation. | ||
// The code will ensure everything that needs to be terminated by `migrationCutOverThreshold` will be terminated. | ||
// Setting `lock_wait_timeout` here is done to just ensure MySQL does not interrupt prematurely. It is not | ||
// done by means of ensuring cleanup/termination of operations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can probably remove that whole sentence. The comment already says "The code will ensure ...", which is sufficient explanation.
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Description
Fixes #16591 by setting a high value for
lock_wait_timeout
in both cut-over sessions.Related Issue(s)
#16591
Checklist
Deployment Notes